Orlicz norm
Efficient Symmetric Norm Regression via Linear Sketching
Song, Zhao, Wang, Ruosong, Yang, Lin, Zhang, Hongyang, Zhong, Peilin
We provide efficient algorithms for overconstrained linear regression problems of size $n \times d$ when the loss function is a symmetric norm (a norm invariant under sign-flips and coordinate permutations). An important class of symmetric norms are Orlicz norms: for a function $G$ and a vector $y \in \mathbb{R}^n$, the corresponding Orlicz norm $\|y\|_G$ is defined as the unique value $\alpha > 0$ such that $\sum_{i=1}^n G(|y_i|/\alpha) = 1$. When the loss function is an Orlicz norm, our algorithm produces a $(1 + \varepsilon)$-approximate solution for an arbitrarily small constant $\varepsilon > 0$ in input-sparsity time, improving over the previously best-known algorithm, which produces a $d \cdot \mathrm{polylog}\, n$-approximate solution. When the loss function is a general symmetric norm, our algorithm produces a $\sqrt{d} \cdot \mathrm{polylog}\, n \cdot \mathrm{mmc}(\ell)$-approximate solution in input-sparsity time, where $\mathrm{mmc}(\ell)$ is a quantity related to the symmetric norm under consideration. To the best of our knowledge, this is the first input-sparsity time algorithm with provable guarantees for the general class of symmetric norm regression problems. Our results shed light on resolving the universal sketching problem for linear regression, and the techniques may be of independent interest to numerical linear algebra problems more broadly.
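The defining equation of the Orlicz norm can be made concrete with a small numeric sketch. Since $\sum_i G(|y_i|/\alpha)$ is strictly decreasing in $\alpha$ for an increasing $G$ with $G(0) = 0$, the unique root can be found by bisection. This is only an illustrative direct solver for the definition, not the sketching-based algorithm of the paper; `orlicz_norm` and its bracketing strategy are this sketch's own construction.

```python
def orlicz_norm(y, G, iters=200):
    """Compute ||y||_G: the unique alpha > 0 with sum_i G(|y_i|/alpha) = 1.

    Assumes G is increasing with G(0) = 0 and G(t) -> infinity as t -> infinity,
    so that f(alpha) = sum_i G(|y_i|/alpha) is strictly decreasing in alpha
    and the bracketing loops below terminate.
    """
    ys = [abs(v) for v in y]
    if all(v == 0.0 for v in ys):
        return 0.0
    f = lambda a: sum(G(v / a) for v in ys)
    # Bracket the root: shrink lo until f(lo) >= 1, grow hi until f(hi) <= 1.
    lo = hi = max(ys)
    while f(lo) < 1.0:
        lo /= 2.0
    while f(hi) > 1.0:
        hi *= 2.0
    # Bisection: f is decreasing, so f(mid) > 1 means the root lies above mid.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

As a sanity check, $G(t) = t^2$ recovers the $\ell_2$ norm and $G(t) = t$ recovers the $\ell_1$ norm, since $\sum_i (|y_i|/\alpha)^p = 1$ is solved exactly by $\alpha = \|y\|_p$ for $p \in \{1, 2\}$.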
Instance-dependent uniform tail bounds for empirical processes
We formulate a uniform tail bound for empirical processes indexed by a class of functions, in terms of the individual deviations of the functions rather than the worst-case deviation in the considered class. The tail bound is established by introducing an initial "deflation" step to the standard generic chaining argument. The resulting tail bound has a main complexity component, a variant of Talagrand's $\gamma$ functional for the deflated function class, as well as an instance-dependent deviation term, measured by an appropriately scaled version of a suitable norm. Both of these terms are expressed using certain coefficients formulated based on the relevant cumulant generating functions. We also provide more explicit approximations for the mentioned coefficients, when the function class lies in a given (exponential type) Orlicz space.
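For orientation, the complexity component mentioned above is a variant of Talagrand's $\gamma$ functional; its standard (non-deflated) form, which the paper's variant builds on, is

```latex
\gamma_\alpha(T, d)
  \;=\; \inf_{(T_n)_{n \ge 0}} \; \sup_{t \in T} \; \sum_{n \ge 0} 2^{n/\alpha} \, d(t, T_n),
```

where the infimum is taken over admissible sequences of subsets $T_n \subseteq T$ with $|T_0| = 1$ and $|T_n| \le 2^{2^n}$, and $d(t, T_n)$ is the distance from $t$ to the nearest point of $T_n$. The paper's deflated variant modifies this standard definition.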
Tailoring to the Tails: Risk Measures for Fine-Grained Tail Sensitivity
Fröhlich, Christian, Williamson, Robert C.
Expected risk minimization (ERM) is at the core of many machine learning systems. This means that the risk inherent in a loss distribution is summarized using a single number - its average. In this paper, we propose a general approach to construct risk measures which exhibit a desired tail sensitivity and may replace the expectation operator in ERM. Our method relies on the specification of a reference distribution with a desired tail behaviour, which is in a one-to-one correspondence to a coherent upper probability. Any risk measure, which is compatible with this upper probability, displays a tail sensitivity which is finely tuned to the reference distribution. As a concrete example, we focus on divergence risk measures based on f-divergence ambiguity sets, which are a widespread tool used to foster distributional robustness of machine learning systems. For instance, we show how ambiguity sets based on the Kullback-Leibler divergence are intimately tied to the class of subexponential random variables. We elaborate the connection between divergence risk measures and rearrangement invariant Banach norms.
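The Kullback-Leibler case mentioned above admits a concrete sample-based sketch via the standard dual (variational) form of the KL-ball worst-case expectation, $\sup\{\mathbb{E}_Q[X] : \mathrm{KL}(Q\|P) \le r\} = \inf_{t>0}\, t \log \mathbb{E}_P[e^{X/t}] + t\,r$. The estimator below is this sketch's own illustration (a crude log-spaced grid over $t$), not a method from the paper.

```python
import math

def kl_divergence_risk(samples, radius):
    """Estimate the KL-divergence risk measure of a loss X from samples:
        rho(X) = sup { E_Q[X] : KL(Q || P) <= radius },
    using the dual form  inf_{t > 0}  t * log E_P[exp(X / t)] + t * radius,
    with P taken to be the empirical distribution of `samples`.
    """
    n = len(samples)
    m = max(samples)  # shift for a numerically stable log-sum-exp

    def objective(t):
        lse = m / t + math.log(sum(math.exp((x - m) / t) for x in samples) / n)
        return t * lse + t * radius

    # Crude minimization: log-spaced grid t in [1e-5, 1e5].
    ts = [10.0 ** (k / 20.0) for k in range(-100, 101)]
    return min(objective(t) for t in ts)
```

Two limiting behaviours make the tail sensitivity visible: `radius = 0` recovers (approximately) the plain expectation used in ERM, while growing `radius` pushes the risk toward the worst observed loss, which is the tail-reweighting effect the paper's framework controls.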